Quantum Chemistry for Solvated Molecules on Graphical Processing Units Using Polarizable Continuum Models
The conductor-like polarization model (C-PCM) with switching/Gaussian smooth discretization is a widely used implicit solvation model in chemical simulations. However, its application in quantum mechanical calculations of large-scale biomolecular systems can be limited by computational expense of both the gas phase electronic structure and the solvation interaction. We have previously used graphical processing units (GPUs) to accelerate the first of these steps. Here, we extend the use of GPUs to accelerate electronic structure calculations including C-PCM solvation. Implementation on the GPU leads to significant acceleration of the generation of the required integrals for C-PCM. We further propose two strategies to improve the solution of the required linear equations: a dynamic convergence threshold and a randomized block-Jacobi preconditioner. These strategies are not specific to GPUs and are expected to be beneficial for both CPU and GPU implementations. We benchmark the performance of the new implementation using over 20 small proteins in solvent environment. Using a single GPU, our method evaluates the C-PCM related integrals and their derivatives more than 10× faster than that with a conventional CPU-based implementation. Our improvements to the linear solver provide a further 3× acceleration. The overall calculations including C-PCM solvation require, typically, 20–40% more effort than that for their gas phase counterparts for a moderate basis set and molecule surface discretization level. The relative cost of the C-PCM solvation correction decreases as the basis sets and/or cavity radii increase. Therefore, description of solvation with this model should be routine. We also discuss applications to the study of the conformational landscape of an amyloid fibril.
United States. Office of Naval Research (N00014-14-1-0590
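The abstract above names a block-Jacobi preconditioner as one way to speed up the linear solve. The following is a minimal sketch of that general idea, not the authors' implementation: it uses plain contiguous diagonal blocks (no randomization), a textbook preconditioned conjugate gradient loop, and a synthetic symmetric positive-definite matrix in place of a real C-PCM matrix. The names `block_jacobi_preconditioner`, `pcg`, and the block size of 10 are hypothetical choices made for illustration.

```python
import numpy as np


def block_jacobi_preconditioner(A, block_size):
    """Return a function that applies the inverse of A's diagonal blocks.

    Assumption: contiguous blocks of fixed size; the randomized block
    assignment mentioned in the abstract is NOT reproduced here.
    """
    n = A.shape[0]
    blocks = []
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        blocks.append((start, stop, np.linalg.inv(A[start:stop, start:stop])))

    def apply(r):
        z = np.empty_like(r)
        for start, stop, inv_block in blocks:
            z[start:stop] = inv_block @ r[start:stop]
        return z

    return apply


def pcg(A, b, precond, tol=1e-8, max_iter=1000):
    """Preconditioned conjugate gradient for a symmetric positive-definite A."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = precond(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for iteration in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        # Stop once the relative residual falls below the threshold.
        if np.linalg.norm(r) <= tol * b_norm:
            return x, iteration
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


# Synthetic SPD test system (a stand-in, not a C-PCM matrix).
rng = np.random.default_rng(0)
M = rng.standard_normal((60, 60))
A = M @ M.T + 60.0 * np.eye(60)
b = rng.standard_normal(60)

x, iters = pcg(A, b, block_jacobi_preconditioner(A, block_size=10))
print("iterations:", iters, "residual norm:", np.linalg.norm(A @ x - b))
```

The `tol` argument is a fixed threshold in this sketch; the dynamic threshold described in the abstract is not modeled.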
Exascale Deep Learning for Climate Analytics
We extract pixel-level masks of extreme weather patterns using variants of
Tiramisu and DeepLabv3+ neural networks. We describe improvements to the
software frameworks, input pipeline, and the network training algorithms
necessary to efficiently scale deep learning on the Piz Daint and Summit
systems. The Tiramisu network scales to 5300 P100 GPUs with a sustained
throughput of 21.0 PF/s and parallel efficiency of 79.0%. DeepLabv3+ scales up
to 27360 V100 GPUs with a sustained throughput of 325.8 PF/s and a parallel
efficiency of 90.7% in single precision. By taking advantage of the FP16 Tensor
Cores, a half-precision version of the DeepLabv3+ network achieves a peak and
sustained throughput of 1.13 EF/s and 999.0 PF/s, respectively.
Comment: 12 pages, 5 tables, 4 figures, Super Computing Conference, November
11-16, 2018, Dallas, TX, US
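The scaling figures quoted in the abstract above can be turned into per-GPU numbers with simple arithmetic. The sketch below uses only the GPU counts, sustained throughputs, and parallel efficiencies stated in the abstract. It assumes that parallel efficiency means sustained throughput divided by ideal (perfectly linear) throughput, which the abstract does not spell out; the dictionary layout and function names are hypothetical.

```python
# Figures copied from the abstract; the structure and names are illustrative.
runs = {
    "Tiramisu (P100)": {"gpus": 5300, "sustained_pflops": 21.0, "efficiency": 0.790},
    "DeepLabv3+ FP32 (V100)": {"gpus": 27360, "sustained_pflops": 325.8, "efficiency": 0.907},
}


def per_gpu_tflops(run):
    """Average sustained throughput per GPU, in TF/s (1 PF/s = 1000 TF/s)."""
    return run["sustained_pflops"] * 1000.0 / run["gpus"]


def ideal_pflops(run):
    """Throughput at perfect scaling, ASSUMING efficiency = sustained / ideal."""
    return run["sustained_pflops"] / run["efficiency"]


for name, run in runs.items():
    print(f"{name}: {per_gpu_tflops(run):.2f} TF/s per GPU, "
          f"ideal {ideal_pflops(run):.1f} PF/s")
```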
Excited-State Electronic Structure with Configuration Interaction Singles and Tamm–Dancoff Time-Dependent Density Functional Theory on Graphical Processing Units
Excited-state calculations are implemented in a development version of the GPU-based TeraChem software package using the configuration interaction singles (CIS) and adiabatic linear response Tamm–Dancoff time-dependent density functional theory (TDA-TDDFT) methods. The speedup of the CIS and TDDFT methods using GPU-based electron repulsion integrals and density functional quadrature integration allows full ab initio excited-state calculations on molecules of unprecedented size. CIS/6-31G and TD-BLYP/6-31G benchmark timings are presented for a range of systems, including four generations of oligothiophene dendrimers, photoactive yellow protein (PYP), and the PYP chromophore solvated with 900 quantum mechanical water molecules. The effects of double and single precision integration are discussed, and mixed precision GPU integration is shown to give extremely good numerical accuracy for both CIS and TDDFT excitation energies (excitation energies within 0.0005 eV of extended double precision CPU results).
Charge Transfer and Polarization in Solvated Proteins from Ab Initio Molecular Dynamics
Charge transfer at the bovine pancreatic trypsin inhibitor (BPTI) protein–water interface was analyzed by means of ab initio Born–Oppenheimer molecular dynamics simulation of the entire protein running on graphical processing units (GPUs). The efficiency of the GPU-based quantum chemistry algorithms implemented in our TeraChem program enables us to perform these calculations on a desktop computer. Mulliken and Voronoi deformation density (VDD) population analysis reveals that between 2.0 and 3.5 electrons are transferred from surrounding water molecules to the protein over the course of the 8.8 ps simulation. Solving for the electronic structure of BPTI in the absence of surrounding water molecules (i.e., in the gas phase) leads to large intraprotein charge transfer, where approximately one electron in total is transferred from neutral to polar residues. Solvation relieves this polarization stress, leading to a neutralization of the excess positive charge of the neutral residues.
Generating Efficient Quantum Chemistry Codes for Novel Architectures
We describe an extension of our graphics processing unit (GPU)
electronic structure program TeraChem to include atom-centered Gaussian
basis sets with d angular momentum functions. This was made possible
by a “meta-programming” strategy that leverages computer
algebra systems for the derivation of equations and their transformation
to correct code. We generate a multitude of code fragments that are
formally mathematically equivalent, but differ in their memory and
floating-point operation footprints. We then select between different
code fragments using empirical testing to find the highest performing
code variant. This leads to an optimal balance of floating-point operations
and memory bandwidth for a given target architecture without laborious
manual tuning. We show that this approach is capable of similar performance
compared to our hand-tuned GPU kernels for basis sets with s and p
angular momenta. We also demonstrate that mixed precision schemes
(using both single and double precision) remain stable and accurate
for molecules with d functions. We provide benchmarks of the execution
time of entire self-consistent field (SCF) calculations using our
GPU code and compare to mature CPU based codes, showing the benefits
of the GPU architecture for electronic structure theory with appropriately
redesigned algorithms. We suggest that the meta-programming and empirical
performance optimization approach may be important in future computational
chemistry applications, especially in the face of quickly evolving
computer architectures.
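The abstract above describes generating several mathematically equivalent code fragments and picking the fastest by empirical testing. The sketch below illustrates only that selection step, on a toy problem: three equivalent ways to evaluate a polynomial are timed and the quickest is chosen. It is an illustration under stated assumptions, not TeraChem's generator; the polynomial coefficients, the variant functions, and `select_fastest` are all hypothetical.

```python
import timeit

import numpy as np

# Hypothetical polynomial c0 + c1*x + c2*x^2 + ... used as the toy kernel.
COEFFS = (1.0, -3.0, 0.5, 2.0, -1.5)


def variant_power(x):
    """Direct power form: more floating-point work, no running accumulator."""
    return sum(c * x**k for k, c in enumerate(COEFFS))


def variant_horner(x):
    """Horner form: fewer multiplications, one accumulator array."""
    acc = np.zeros_like(x)
    for c in reversed(COEFFS):
        acc = acc * x + c
    return acc


def variant_polyval(x):
    """Library routine; np.polyval expects the highest-degree coefficient first."""
    return np.polyval(COEFFS[::-1], x)


VARIANTS = {
    "power": variant_power,
    "horner": variant_horner,
    "polyval": variant_polyval,
}


def select_fastest(variants, x, repeats=5, number=20):
    """Check that all variants agree, time each one, and return the fastest."""
    reference = next(iter(variants.values()))(x)
    timings = {}
    for name, fn in variants.items():
        if not np.allclose(fn(x), reference):
            raise ValueError(f"variant {name!r} disagrees with the reference")
        timings[name] = min(timeit.repeat(lambda: fn(x), repeat=repeats, number=number))
    best = min(timings, key=timings.get)
    return best, timings


x = np.linspace(-1.0, 1.0, 10_000)
best, timings = select_fastest(VARIANTS, x)
print("fastest variant on this machine:", best)
```

Which variant wins depends on the hardware and array size, which is the point of selecting empirically rather than by operation count alone.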
Ab Initio Quantum Chemistry for Protein Structures
Structural properties of over 55 small proteins have been determined
using both density-based and wave-function-based electronic structure
methods in order to assess the ability of ab initio “force
fields” to retain the properties described by experimental
structures measured with crystallography or nuclear magnetic resonance.
The efficiency of the GPU-based quantum chemistry algorithms implemented
in our TeraChem program enables us to carry out systematic optimization
of ab initio protein structures, which we compare against experimental
and molecular mechanics force field references. We show that the quality
of the ab initio optimized structures, as judged by conventional protein
health metrics, increases with increasing basis set size. On the other
hand, there is little evidence for a significant improvement of predicted
structures using density functional theory as compared to Hartree–Fock
methods. Although occasional pathologies of minimal basis sets are
observed, these are easily alleviated with even the smallest double-ζ
basis sets.